Abstract

We introduced signal representations to detect and analyze transients. A new
algorithm called matching pursuit was developed to decompose any signal as a
sum of waveforms that are chosen adaptively from a redundant dictionary of
patterns, in order to best match the signal structures. This algorithm was
applied to a dictionary of dilated Gabor functions in order to characterize
oscillatory transients of various sizes and frequencies. The asymptotic
properties of this algorithm have been analyzed and we proved the existence of
an attractor. This led to a general noise removal procedure which has been
applied to audio signals. A fast matching pursuit algorithm was also designed
and implemented in software that is freely available on the Internet. The
matching pursuit algorithm has been extended to two dimensions for image
processing. The image dictionary is composed of translated, dilated, and
rotated wavelets. Applications to texture discrimination have been studied. To
isolate patterns whose supports may intersect, we have introduced a high
resolution pursuit algorithm, which was used to decompose high resolution
radar signals.

1  Introduction


The complexity of structures encountered in ATR signals requires the
development of adaptive low-level representations. Although these signals are
entirely characterized by their decomposition in a basis, a basis is a minimal
set of vectors that is not rich enough to represent all components efficiently.
Some signal structures are diffused across many basis elements and are thus
difficult to analyze from this expansion. For example, image variations
corresponding to edges and textures are not efficiently represented by the same
types of waveforms. The same issue appears in sounds that include transients,
which are well represented by short waveforms, and harmonics, which are more
efficiently decomposed over long waveforms with narrow frequency support.
Instead of decomposing all signals over the same family of waveforms, we
introduced adaptive transforms that choose the decomposition vectors depending
upon the signal properties. These vectors are selected from a family of
waveforms that is much larger than a basis, which is called a dictionary.

In this study we have developed an algorithm called matching pursuit
(section 2), which decomposes signals over dictionary vectors that are selected
with a greedy strategy. Most of the signal energy can be approximated with few
dictionary vectors, which can be interpreted as essential signal features.
Applications of matching pursuit to sounds (section 3) and images (section 5)
have been developed with dictionaries of time-frequency atoms and wavelets. The
study of the asymptotic properties of the pursuit has shown the existence of a
chaotic attractor that we used for noise removal (section 4). To isolate
features whose supports overlap, in collaboration with Alan Willsky's group at
MIT, we have developed a high resolution pursuit algorithm which can segment
closely spaced features (section 7).

The extraction of information from signals also requires the analysis of
structures that are better modeled in a stochastic framework. For stochastic
signals, we are not interested in the exact behavior of a particular
realization but we want to identify the underlying process. One or a few
realizations give very little information about this process. We thus
concentrate on second order moment properties. A new algorithm to estimate the
covariance of non-stationary processes has been introduced in collaboration
with Prof. Papanicolaou.
By iterating this decomposition up to the order $m$, we can decompose $f$ into
the telescoping sum

$$f = \sum_{n=0}^{m-1} \left( R^n f - R^{n+1} f \right) + R^m f. \tag{7}$$

Equation  (5)  yields 

$$f = \sum_{n=0}^{m-1} \langle R^n f, g_{\gamma_n} \rangle \, g_{\gamma_n} + R^m f. \tag{8}$$

Similarly, we write $\|f\|^2$ as a telescoping sum

$$\|f\|^2 = \sum_{n=0}^{m-1} \left( \|R^n f\|^2 - \|R^{n+1} f\|^2 \right) + \|R^m f\|^2 \tag{9}$$

which  we  combine  with  (6)  to  obtain  an  energy  conservation  equation 

$$\|f\|^2 = \sum_{n=0}^{m-1} |\langle R^n f, g_{\gamma_n} \rangle|^2 + \|R^m f\|^2. \tag{10}$$

A matching pursuit decomposes any $f$ into a sum of dictionary elements
which are chosen to best match its residues. Although this decomposition
is non-linear, we maintain an energy conservation as though it were a linear,
orthogonal decomposition. An important issue was to understand the behavior
of the residue $R^m f$ when $m$ increases. We proved [1] that the matching
pursuit converges, even in infinite dimensional spaces.
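
The greedy iteration and the energy conservation (10) can be checked
numerically. The following is a minimal sketch in a finite dimensional space,
assuming a dictionary of random unit-norm vectors; the function names, the
dimensions and the random dictionary are illustrative choices, not part of the
original study.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def matching_pursuit(f, dictionary, m):
    """Run m greedy iterations over unit-norm atoms.

    Returns (coeffs, indices, residue) such that
    f = sum_n coeffs[n] * dictionary[indices[n]] + residue.
    """
    residue = list(f)
    coeffs, indices = [], []
    for _ in range(m):
        # Select the atom that is most correlated with the residue R^n f.
        products = [dot(residue, g) for g in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        c = products[best]
        # R^{n+1} f = R^n f - <R^n f, g> g
        residue = [r - c * g for r, g in zip(residue, dictionary[best])]
        coeffs.append(c)
        indices.append(best)
    return coeffs, indices, residue

random.seed(0)
dim, n_atoms = 16, 64  # arbitrary sizes for the illustration
dictionary = [normalize([random.gauss(0, 1) for _ in range(dim)])
              for _ in range(n_atoms)]
f = [random.gauss(0, 1) for _ in range(dim)]
coeffs, indices, residue = matching_pursuit(f, dictionary, m=10)

energy_f = dot(f, f)
energy_split = sum(c * c for c in coeffs) + dot(residue, residue)
print(energy_f, energy_split)  # the two values agree up to rounding
```

The equality holds at every iteration because each new residue is orthogonal
to the atom that was just selected.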
This theorem proves that any vector $f$ is characterized by the double
sequence $(\langle R^n f, g_{\gamma_n} \rangle, \gamma_n)_{n \in \mathbb{N}}$,
which specifies the expansion coefficients and the index of each chosen vector
within the dictionary. After $m$ iterations, (8) shows that the approximation
error is

$$R^m f = f - \sum_{n=0}^{m-1} \langle R^n f, g_{\gamma_n} \rangle \, g_{\gamma_n}. \tag{14}$$

The best approximation of $f$ as a linear expansion of
$\{g_{\gamma_n}\}_{0 \le n < m}$ is the orthogonal projection of $f$ on the
space generated by this family of vectors. In general the vectors
$\{g_{\gamma_n}\}_{0 \le n < m}$ are not orthogonal, so the matching pursuit
expansion is not equal to the orthogonal projection of $f$. An improved
approximation was introduced in collaboration with Geoff Davis, by
orthogonalizing the family $\{g_{\gamma_n}\}_{0 \le n < m}$ with a
Gram-Schmidt procedure and computing the orthogonal projection of $f$ [4].
Such an orthogonal pursuit gives better approximations at the cost of an
increased computational complexity.

3  Sound  Pursuit 

To analyze the time and frequency localization properties of one-dimensional
oscillatory signals such as speech, Zhifeng Zhang [1] used a large dictionary
of time-frequency atoms. Our signal space is $L^2(\mathbb{R})$ and we construct
such a dictionary by scaling, translating and modulating a single window
function $g(t) \in L^2(\mathbb{R})$. We suppose that $g(t)$ is an even and real
function of unit norm. For any scale $s > 0$, frequency modulation $\xi$ and
translation $u$, we denote $\gamma = (s, u, \xi)$ and define

$$g_\gamma(t) = \frac{1}{\sqrt{s}} \, g\!\left(\frac{t-u}{s}\right) e^{i \xi t}. \tag{15}$$

The index $\gamma$ is an element of the set
$\Gamma = \mathbb{R}^+ \times \mathbb{R}^2$. The factor $1/\sqrt{s}$ normalizes
to 1 the norm of $g_\gamma(t)$. The function $g_\gamma(t)$ is centered at the
abscissa $u$ and its energy is concentrated in a neighborhood of $u$, whose
size is proportional to $s$. Let $\hat{g}(\omega)$ be the Fourier transform of
$g(t)$. Equation (15) yields

$$\hat{g}_\gamma(\omega) = \sqrt{s} \, \hat{g}(s(\omega - \xi)) \, e^{-i(\omega - \xi)u}. \tag{16}$$

Since $|\hat{g}(\omega)|$ is even, $|\hat{g}_\gamma(\omega)|$ is centered at
the frequency $\omega = \xi$. Its energy is concentrated in a neighborhood of
$\xi$ whose size is proportional to $1/s$. The dictionary of time-frequency
atoms $\mathcal{D} = \{g_\gamma(t)\}_{\gamma \in \Gamma}$ is a very redundant
set of functions in $L^2(\mathbb{R})$ that includes window Fourier frames and
wavelet frames.

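A matching pursuit chooses the time-frequency atoms of $\mathcal{D}$ which are
"best" adapted to expand $f$. Since a time-frequency atom dictionary is
complete, Theorem 1 proves that

$$f = \sum_{n=0}^{+\infty} \langle R^n f, g_{\gamma_n} \rangle \, g_{\gamma_n}, \tag{17}$$

where $\gamma_n = (s_n, u_n, \xi_n)$ and

$$g_{\gamma_n}(t) = \frac{1}{\sqrt{s_n}} \, g\!\left(\frac{t-u_n}{s_n}\right) e^{i \xi_n t}. \tag{18}$$

Numerical studies have been performed on speech and audio signals [1], [4].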
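
As an illustration of (15), the sketch below samples a time-frequency atom
and checks numerically that the factor $1/\sqrt{s}$ keeps its norm equal to 1
at every scale. The Gaussian window, the sampling interval and the function
names are assumptions made for this example; the text only requires $g$ to be
even, real and of unit norm.

```python
import cmath
import math

def window(t):
    """Gaussian window with unit L2 norm (an assumed choice of g)."""
    return 2 ** 0.25 * math.exp(-math.pi * t * t)

def gabor_atom(t, s, u, xi):
    """Value at time t of g_gamma for gamma = (s, u, xi), following (15)."""
    return window((t - u) / s) / math.sqrt(s) * cmath.exp(1j * xi * t)

def energy(s, u, xi, t_min=-50.0, t_max=50.0, n=4001):
    """Riemann-sum approximation of the squared L2 norm of g_gamma."""
    dt = (t_max - t_min) / (n - 1)
    return dt * sum(abs(gabor_atom(t_min + k * dt, s, u, xi)) ** 2
                    for k in range(n))

# Same translation and modulation, three different scales.
energies = {s: energy(s, u=3.0, xi=2.0) for s in (0.5, 2.0, 4.0)}
print(energies)  # every value is close to 1
```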


4  Chaotic  Attractor  and  Noise  Removal 


The asymptotic convergence of a matching pursuit has further been studied
by analyzing the behavior of the normalized residue

$$\tilde{R}^n f = \frac{R^n f}{\|R^n f\|}.$$

The non-linear map defined by $\tilde{R}^{n+1} f = M(\tilde{R}^n f)$ exhibits
chaotic properties. Experimental data suggest that the normalized residues of
a normalized pursuit converge to a chaotic attractor. In low-dimensional
spaces, Geoff Davis [3] proved that $M$ is topologically equivalent to a
left-shift map operator, whose chaotic properties are entirely known. In high
dimensional spaces, the analysis was performed for a particular dictionary of
Diracs and complex exponentials. Numerically, one can observe that the first
few iterations of the pursuit extract the components of $f$ which are strongly
correlated with dictionary vectors, which we call the coherent part. The
remaining residue does not correlate strongly with any dictionary vector and
its properties depend upon the attractor of the chaotic map. We call it a
"dictionary noise". For dictionaries of time-frequency atoms these residues
converge to realizations of white noise [3].

By tracking the convergence of the residue to the dictionary noise attractor,
we can isolate the coherent signal components that are well approximated
by few dictionary vectors. Applications to noise removal from speech signals
have been studied by Geoff Davis [3].
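
One simple way to picture this separation is to monitor, at each iteration,
the correlation ratio $|\langle R^n f, g_{\gamma_n} \rangle| / \|R^n f\|$ and
to stop when it becomes small. The sketch below does this on a toy example.
The fixed threshold used as a stopping rule, the dictionary of Diracs and the
test signal are all assumptions of this illustration; they are not
necessarily the criterion or the dictionaries used in [3].

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def coherent_part(f, dictionary, threshold, max_iter=100):
    """Greedy pursuit that stops once the residue correlates poorly.

    Stopping when the correlation ratio falls below `threshold` is a
    hypothetical rule chosen for this sketch.
    """
    residue = list(f)
    selected, ratios = [], []
    for _ in range(max_iter):
        norm = math.sqrt(dot(residue, residue))
        if norm == 0.0:
            break
        products = [dot(residue, g) for g in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        ratio = abs(products[best]) / norm
        ratios.append(ratio)
        if ratio < threshold:
            break  # the residue now behaves like dictionary noise
        selected.append((best, products[best]))
        residue = [r - products[best] * g
                   for r, g in zip(residue, dictionary[best])]
    return selected, ratios, residue

dim = 16
# Toy dictionary: the Dirac (canonical) basis of R^16.
dictionary = [[1.0 if i == k else 0.0 for i in range(dim)]
              for k in range(dim)]
# Two strong components on top of a flat, low-amplitude background.
f = [0.1 * (-1) ** i for i in range(dim)]
f[0] += 5.0
f[1] += 3.0

selected, ratios, residue = coherent_part(f, dictionary, threshold=0.5)
print([k for k, _ in selected])  # [0, 1]
print(ratios)
```

The two strong components are extracted with ratios close to 1, after which
the ratio drops to roughly 0.27 and the pursuit stops.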
5  Image  Pursuit

For image processing, we must select a dictionary that can characterize the
local scale and orientation of the image variations. For this purpose, Francois
Bergeaud [7] introduced a dictionary composed of several two-dimensional
wavelets that have specific orientation selectivity. These wavelets are all
derived from a two-dimensional window $g(x,y)$ that is modulated at a fixed
frequency $\omega_0$ along several directions specified by an angle $\theta$
in the $(x,y)$ plane:

$$g_\theta(x,y) = g(x,y) \exp[i \omega_0 (x \cos\theta + y \sin\theta)].$$

These oriented wavelets are then scaled by $s$ and translated to define a whole
family of wavelets $\{g_\gamma\}_{\gamma \in \Gamma}$ with

$$g_\gamma(x,y) = \frac{1}{s} \, g_\theta\!\left(\frac{x-u}{s}, \frac{y-v}{s}\right). \tag{19}$$

The multiparameter index $\gamma = (\theta, s, u, v)$ carries the orientation,
scale and position of the corresponding wavelet.

In numerical computations, the scale is restricted to
$\{2^j\}_{j \in \mathbb{Z}}$ and the angles are discretized. The wavelet
dictionary used in the numerical examples includes 8 orientations. The
matching pursuit algorithm applied to this wavelet dictionary iteratively
selects the wavelets whose scales, orientations and positions best match the
local image variations. Applications to texture discrimination and noise
removal have been developed [7].
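
The construction (19) can be sketched as follows for 8 orientations and dyadic
scales. The Gaussian window, the value of the modulation frequency, the
uniform spacing of the angles over $[0, \pi)$ and the function names are
assumptions of this example rather than the choices made in [7]. The check
confirms that the factor $1/s$ preserves the norm of the wavelets.

```python
import cmath
import math

OMEGA0 = 3.0  # modulation frequency; arbitrary value for this sketch

def window2d(x, y):
    """Two-dimensional Gaussian window with unit L2 norm (assumed)."""
    return math.exp(-(x * x + y * y) / 2.0) / math.sqrt(math.pi)

def oriented_wavelet(x, y, theta, s, u, v):
    """Value at (x, y) of g_gamma for gamma = (theta, s, u, v), as in (19)."""
    xs, ys = (x - u) / s, (y - v) / s
    phase = OMEGA0 * (xs * math.cos(theta) + ys * math.sin(theta))
    return window2d(xs, ys) * cmath.exp(1j * phase) / s

def energy(theta, s, half_width=16.0, n=65):
    """Riemann-sum approximation of the squared L2 norm of g_gamma."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = -half_width + i * h, -half_width + j * h
            total += abs(oriented_wavelet(x, y, theta, s, 0.0, 0.0)) ** 2
    return total * h * h

angles = [k * math.pi / 8 for k in range(8)]  # 8 orientations
energies = [energy(theta, s) for theta in angles for s in (1.0, 2.0)]
print(min(energies), max(energies))  # both close to 1
```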

6  Fast  Numerical  Computations 

At first glance, a matching pursuit seems to require a hopeless amount
of computations. These computations can however be considerably reduced
with an efficient algorithm that prunes the dictionary with local maxima [1],
[7]. For $f \in \mathbf{H}$, we call a local maximum in the parameter space
$\Gamma$ an index $\gamma_0$ such that for all $\gamma$ in a neighborhood of
$\gamma_0$ in $\Gamma$

$$|\langle f, g_\gamma \rangle| \le |\langle f, g_{\gamma_0} \rangle|. \tag{20}$$

For example, in a Gabor dictionary of one-dimensional time-frequency
atoms, the local maxima are computed for fixed scale. For each scale $s$, the
local maxima are defined as indexes $\gamma_0 = (s, u_0, \xi_0)$ such that (20)
is valid for any $\gamma = (s, u, \xi)$ with $(u, \xi)$ in a neighborhood of
$(u_0, \xi_0)$.
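
On a discretized dictionary, condition (20) can be tested by comparing each
correlation modulus with its neighbors on the $(u, \xi)$ grid at a fixed
scale. The sketch below assumes a rectangular grid and an 8-connected
neighborhood; both are illustrative choices, and the table of values is made
up for the example.

```python
def local_maxima(corr):
    """Return the grid indexes (i, j) that satisfy (20).

    corr[i][j] holds |<f, g_gamma>| for the i-th translation and the
    j-th frequency at one fixed scale. The neighborhood is the set of
    up to 8 adjacent grid points (an assumption of this sketch).
    """
    rows, cols = len(corr), len(corr[0])
    maxima = []
    for i in range(rows):
        for j in range(cols):
            neighbors = [corr[a][b]
                         for a in range(max(0, i - 1), min(rows, i + 2))
                         for b in range(max(0, j - 1), min(cols, j + 2))
                         if (a, b) != (i, j)]
            if all(corr[i][j] >= value for value in neighbors):
                maxima.append((i, j))
    return maxima

# Made-up correlation moduli on a 4 x 4 grid.
corr = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.4],
    [0.0, 0.1, 0.4, 0.8],
]
print(local_maxima(corr))  # [(1, 1), (3, 3)]
```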
At step 1 of the algorithm we prune the dictionary with a local maxima
selection. All inner products
$\{\langle f, g_\gamma \rangle\}_{\gamma \in \Gamma}$ are computed. We choose a
threshold $\epsilon$ and select only the local maxima that are large enough:

$$|\langle f, g_\gamma \rangle| \ge \epsilon \|f\|.$$

The matching pursuit is then computed by induction as follows.

Suppose that the first $n$ vectors $\{g_{\gamma_k}\}_{0 \le k < n}$ have been
selected. We denote by $\Gamma_n$ the indexes $\gamma$ such that
$|\langle f, g_\gamma \rangle|$ is a local maximum and
$|\langle R^n f, g_\gamma \rangle| \ge \epsilon \|f\|$. We find
$g_{\gamma_n}$ which best correlates with $R^n f$ in this reduced dictionary:

$$|\langle R^n f, g_{\gamma_n} \rangle| = \sup_{\gamma \in \Gamma_n} |\langle R^n f, g_\gamma \rangle|.$$

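We compute the inner product of the new residue $R^{n+1} f$ with all
$g_\gamma$, $\gamma \in \Gamma_n$, with an updating formula derived from
equation (5):

$$\langle R^{n+1} f, g_\gamma \rangle = \langle R^n f, g_\gamma \rangle - \langle R^n f, g_{\gamma_n} \rangle \langle g_{\gamma_n}, g_\gamma \rangle. \tag{21}$$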

Since we previously stored $\langle R^n f, g_\gamma \rangle$ and
$\langle R^n f, g_{\gamma_n} \rangle$, this update is obtained in $O(1)$
operations if the value $\langle g_{\gamma_n}, g_\gamma \rangle$ can be
retrieved in $O(1)$ operations. This is the case for the Gabor dictionary of
one-dimensional time-frequency atoms and the dictionary of two-dimensional
wavelets. The vectors in these dictionaries have a sparse interaction, which
means that for most $\gamma \in \Gamma_n$, we have
$\langle g_{\gamma_n}, g_\gamma \rangle = 0$. There are thus few indexes
$\gamma$ for which the value of $\langle R^n f, g_\gamma \rangle$ must be
updated. The dictionary is further pruned by suppressing from $\Gamma_n$ all
indexes $\gamma$ such that
$|\langle R^{n+1} f, g_\gamma \rangle| < \epsilon \|f\|$. The iteration is
then continued on this new index set $\Gamma_{n+1}$.
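
The update (21) and the threshold pruning can be sketched as follows. This is
a simplified illustration under stated assumptions: the dictionary is a set of
random unit vectors, the local maxima selection is omitted, and the inner
product between two atoms is recomputed directly where the dictionaries
discussed above would supply it in $O(1)$ operations. The function name and
parameter values are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def pruned_pursuit(f, dictionary, eps, max_iter):
    """Pursuit over a thresholded index set, updated with formula (21)."""
    threshold = eps * math.sqrt(dot(f, f))
    # Step 1: all inner products, then pruning by the threshold.
    products = [dot(f, g) for g in dictionary]
    active = {k for k, p in enumerate(products) if abs(p) >= threshold}
    selected = []
    while active and len(selected) < max_iter:
        best = max(active, key=lambda k: abs(products[k]))
        c = products[best]
        selected.append((best, c))
        for k in active:
            # Stand-in for an O(1) lookup of <g_best, g_k>.
            gram = dot(dictionary[best], dictionary[k])
            products[k] -= c * gram  # formula (21)
        # Suppress the indexes whose correlation fell below the threshold.
        active = {k for k in active if abs(products[k]) >= threshold}
    return selected, products, active

random.seed(1)
dim, n_atoms = 16, 64  # arbitrary sizes for the illustration
dictionary = [normalize([random.gauss(0, 1) for _ in range(dim)])
              for _ in range(n_atoms)]
f = [random.gauss(0, 1) for _ in range(dim)]
selected, products, active = pruned_pursuit(f, dictionary, eps=0.05, max_iter=3)
print(selected)
```

For every index that is still active, the stored value equals the inner
product of the current residue with the corresponding atom, so the residue
itself never has to be formed during the iterations.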

If we iterate this procedure, the index set $\Gamma_n$ is progressively
reduced until it becomes empty for $n = m$. We then return to step 1 and
replace $f$ by $R^m f$. The local maxima of $\langle R^m f, g_\gamma \rangle$
are computed and are thresholded with the new value $\epsilon \|R^m f\|$. The
pursuit is then continued on these maxima with the iteration previously
described, until the index set is again empty for $n = p$. We return again to
step 1 by replacing $f$ by $R^p f$ and continue the iterations. Software
implementing matching pursuit for time-frequency dictionaries is available
through anonymous ftp at the address cs.nyu.edu. Instructions are in the file
README of the directory /pub/wave/software.


7  High  Resolution  Pursuit 

The matching pursuit is a greedy strategy which does not use any "look-ahead"
for selecting the dictionary vectors. When features have supports that
intersect, this can induce a selection of dictionary vectors that is not
optimal. In collaboration with Prof. Alan Willsky's group at MIT, we developed
a high resolution pursuit algorithm that uses the same greedy strategy but
which replaces the optimization of an $L^2(\mathbb{R})$ correlation by a
non-linear measure which is more sensitive to the local fit of features [9].

The high resolution pursuit algorithm was developed for two types of
dictionaries. We used a dictionary of dilated and translated box spline
functions to decompose high resolution radar signals. These features are then
used for classification. We also developed an application to the detection of
transitions in sounds by using a dictionary of wavepackets [9].

8  Estimation  of  Covariance 

For general non-stationary processes the covariance matrix cannot be estimated
reliably from a few realizations of the process. However, if we can find
a basis in which the covariance operator is well approximated by a sparse
matrix, it is possible to reduce substantially the variance by estimating only
the (essentially) non-zero matrix elements. It is thus necessary to estimate
from the data the basis in which the covariance operator is well approximated
by a sparse matrix, as well as the non-zero matrix elements. A best basis
search algorithm has been introduced to compress the covariance operator.

Locally stationary processes appear in many physical systems in which the
mechanisms that produce random fluctuations change slowly in time or space.
Over short time intervals, such processes can be approximated by a stationary
one. This is the case for many components of speech signals. We have shown
that the covariance operator of such processes is nearly diagonalized in an
appropriate local cosine basis. A best basis algorithm was designed to search
for this best basis and estimate the compressed covariance matrix.
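
The variance reduction obtained by estimating only the diagonal elements can
be sketched as follows. This is a simplified illustration under stated
assumptions: a global discrete cosine basis stands in for the local cosine
basis, the basis is fixed in advance instead of being found by a best basis
search, and the realizations are zero-mean synthetic data. All names are
hypothetical.

```python
import math
import random

def cosine_basis(n):
    """Orthonormal discrete cosine (DCT-II) basis of R^n."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * (t + 0.5) * k / n)
                      for t in range(n)])
    return basis

def diagonal_estimate(realizations, basis):
    """Empirical variance of each basis coefficient (zero mean assumed)."""
    variances = []
    for b in basis:
        coeffs = [sum(x * y for x, y in zip(r, b)) for r in realizations]
        variances.append(sum(c * c for c in coeffs) / len(realizations))
    return variances

def covariance_from_diagonal(variances, basis):
    """Covariance matrix that is diagonal in the given orthonormal basis."""
    n = len(basis)
    return [[sum(variances[k] * basis[k][i] * basis[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

random.seed(2)
n, n_real = 8, 4  # signal size and number of realizations (arbitrary)
basis = cosine_basis(n)
realizations = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n_real)]

variances = diagonal_estimate(realizations, basis)
cov = covariance_from_diagonal(variances, basis)
print(variances)
```

Only $n$ values are estimated instead of the $n(n+1)/2$ distinct entries of a
full covariance matrix, which is what makes the estimate usable with few
realizations when the operator is nearly diagonal in the chosen basis.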